107 research outputs found

    HistoMIL: A Python package for training multiple instance learning models on histopathology slides

    Hematoxylin and eosin (H&E) stained slides are widely used in disease diagnosis. Remarkable advances in deep learning have made it possible to detect complex molecular patterns in these histopathology slides, suggesting automated approaches could help inform pathologists’ decisions. Multiple instance learning (MIL) algorithms have shown promise in this context, outperforming transfer learning (TL) methods for various tasks, but their implementation and usage remain complex. We introduce HistoMIL, a Python package designed to streamline the implementation, training and inference process of MIL-based algorithms for computational pathologists and biomedical researchers. It integrates a self-supervised learning module for feature encoding, and a full pipeline encompassing TL and three MIL algorithms: ABMIL, DSMIL, and TransMIL. The PyTorch Lightning framework enables effortless customization and algorithm implementation. We illustrate HistoMIL's capabilities by building predictive models for 2,487 cancer hallmark genes on breast cancer histology slides, achieving AUROC performances of up to 85%.

    Pan-Cancer Survey of Tumor Mass Dormancy and Underlying Mutational Processes

    Tumor mass dormancy is the key intermediate step between immune surveillance and cancer progression, yet due to its transitory nature it has been difficult to capture and characterize. Little is understood of its prevalence across cancer types and of the mutational background that may favor such a state. While this balance is finely tuned internally by the equilibrium between cell proliferation and cell death, the main external factors contributing to tumor mass dormancy are immunological and angiogenic. To understand the genomic and cellular context in which tumor mass dormancy may develop, we comprehensively profiled signals of immune and angiogenic dormancy in 9,631 cancers from the Cancer Genome Atlas and linked them to tumor mutagenesis. We find evidence for immunological and angiogenic dormancy-like signals in 16.5% of bulk sequenced tumors, with a frequency of up to 33% in certain tissues. Mutations in the CASP8 and HRAS oncogenes were positively selected in dormant tumors, suggesting an evolutionary pressure for controlling cell growth/apoptosis signals. By surveying the mutational damage patterns left in the genome by known cancer risk factors, we found that aging-induced mutations were relatively depleted in these tumors, while patterns of smoking and defective base excision repair were linked with increased tumor mass dormancy. Furthermore, we identified a link between APOBEC mutagenesis and dormancy, which comes in conjunction with immune exhaustion and may partly depend on the expression of the angiogenesis regulator PLG as well as interferon and chemokine signals. Tumor mass dormancy also appeared to be impaired in hypoxic conditions in the majority of cancers. The microenvironment of dormant cancers was enriched in cytotoxic and regulatory T cells, as expected, but also in macrophages and showed a reduction in inflammatory Th17 signals. Finally, tumor mass dormancy was linked with improved patient survival outcomes. 
Our analysis sheds light onto the complex interplay between dormancy, exhaustion, APOBEC activity and hypoxia, and sets directions for future mechanistic explorations.

    A Comparison of Low Read Depth QuantSeq 3′ Sequencing to Total RNA-Seq in FUS Mutant Mice

    Transcriptomics is a developing field with new methods of analysis being produced which may hold advantages in price, accuracy, or information output. QuantSeq is a form of 3′ sequencing produced by Lexogen which aims to obtain similar gene-expression information to RNA-seq with significantly fewer reads, and therefore at a lower cost. QuantSeq is also able to provide information on differential polyadenylation. We applied both QuantSeq at low read depth and total RNA-seq to the same two sets of mouse spinal cord RNAs, each comprising four controls and four mutants related to the neurodegenerative disease amyotrophic lateral sclerosis. We found substantial differences in which genes were found to be significantly differentially expressed by the two methods. Some of this difference is likely due to the difference in number of reads between our QuantSeq and RNA-seq data. Other sources of difference can be explained by the differences in the way the two methods handle genes with different primary transcript lengths and how likely each method is to find a gene to be differentially expressed at different levels of overall gene expression. This work highlights how different methods aiming to assess expression difference can lead to different results.

    How neurons migrate: a dynamic in-silico model of neuronal migration in the developing cortex

    Background: Neuronal migration, the process by which neurons migrate from their place of origin to their final position in the brain, is a central process for normal brain development and function. Advances in experimental techniques have revealed much about many of the molecular components involved in this process. Notwithstanding these advances, how the molecular machinery works together to govern the migration process has yet to be fully understood. Here we present a computational model of neuronal migration, in which four key molecular entities, Lis1, DCX, Reelin and GABA, form a molecular program that mediates the migration process. Results: The model simulated the dynamic migration process, consistent with in-vivo observations of morphological, cellular and population-level phenomena. Specifically, the model reproduced migration phases, cellular dynamics and population distributions that concur with experimental observations in normal neuronal development. We tested the model under reduced activity of Lis1 and DCX and found an aberrant development similar to observations in Lis1 and DCX silencing expression experiments. Analysis of the model gave rise to unforeseen insights that could guide future experimental study. Specifically: (1) the model revealed the possibility that under conditions of Lis1 reduced expression, neurons experience an oscillatory neuron-glial association prior to the multipolar stage; and (2) we hypothesized that observed morphology variations in rats and mice may be explained by a single difference in the way that Lis1 and DCX stimulate bipolar motility. From this we make the following predictions: (1) under reduced Lis1 and enhanced DCX expression, we predict a reduced bipolar migration in rats, and (2) under enhanced DCX expression in mice we predict a normal or a higher bipolar migration. Conclusions: We present here a system-wide computational model of neuronal migration that integrates theory and data within a precise, testable framework. Our model accounts for a range of observable behaviors and affords a computational framework to study aspects of neuronal migration as a complex process that is driven by a relatively simple molecular program. Analysis of the model generated new hypotheses and yet unobserved phenomena that may guide future experimental studies. This paper thus reports a first step toward a comprehensive in-silico model of neuronal migration.

    Molecular landscape of esophageal cancer: implications for early detection and personalized therapy

    Esophageal cancer (EC) is one of the most lethal cancers and a public health concern worldwide, owing to late diagnosis and lack of efficient treatment. Esophageal squamous cell carcinoma (ESCC) and esophageal adenocarcinoma (EAC) are the main histopathological subtypes of EC that show striking differences in geographical distribution, possibly due to differences in exposure to risk factors and lifestyles. ESCC and EAC are distinct diseases in terms of cell of origin, epidemiology, and molecular architecture of tumor cells. Past efforts aimed at translating potential molecular candidates into clinical practice proved to be challenging, underscoring the need for identifying novel candidates for early diagnosis and therapy of EC. Several major international efforts have brought about important advances in identifying molecular landscapes of ESCC and EAC toward understanding molecular mechanisms and critical molecular events driving the progression and pathological features of the disease. In our review, we summarize recent advances in the areas of genomics and epigenomics of ESCC and EAC, their mutational signatures and immunotherapy. We also discuss implications of recent advances in characterizing the genome and epigenome of EC for the discovery of diagnostic/prognostic biomarkers and development of new targets for personalized treatment and prevention.

    Genomic hallmarks and therapeutic implications of G0 cell cycle arrest in cancer

    BACKGROUND: Therapy resistance in cancer is often driven by a subpopulation of cells that are temporarily arrested in a non-proliferative G0 state, which is difficult to capture and whose mutational drivers remain largely unknown. RESULTS: We develop methodology to robustly identify this state from transcriptomic signals and characterise its prevalence and genomic constraints in solid primary tumours. We show that G0 arrest preferentially emerges in the context of more stable, less mutated genomes which maintain TP53 integrity and lack the hallmarks of DNA damage repair deficiency, while presenting increased APOBEC mutagenesis. We employ machine learning to uncover novel genomic dependencies of this process and validate the role of the centrosomal gene CEP89 as a modulator of proliferation and G0 arrest capacity. Lastly, we demonstrate that G0 arrest underlies unfavourable responses to various therapies exploiting cell cycle, kinase signalling and epigenetic mechanisms in single-cell data. CONCLUSIONS: We propose a G0 arrest transcriptional signature that is linked with therapeutic resistance and can be used to further study and clinically track this state.

    Genomic Analysis of Response to Neoadjuvant Chemotherapy in Esophageal Adenocarcinoma

    Neoadjuvant therapy followed by surgery is the standard of care for locally advanced esophageal adenocarcinoma (EAC). Unfortunately, response to neoadjuvant chemotherapy (NAC) is poor (20-37%), as is the overall survival benefit at five years (9%). The EAC genome is complex and heterogeneous between patients, and it is not yet understood whether specific mutational patterns may result in chemotherapy sensitivity or resistance. To identify associations between genomic events and response to NAC in EAC, a comparative genomic analysis was performed in 65 patients with extensive clinical and pathological annotation using whole-genome sequencing (WGS). We defined response using Mandard Tumor Regression Grade (TRG), with responders classified as TRG1-2 (n = 27) and non-responders classified as TRG4-5 (n = 38). We report a higher non-synonymous mutation burden in responders (median 2.08/Mb vs. 1.70/Mb, p = 0.036) and elevated copy number variation in non-responders (282 vs. 136/patient, p < 0.001). We identified copy number variants unique to each group in our cohort, with cell cycle (CDKN2A, CCND1), c-Myc (MYC), RTK/PIK3 (KRAS, EGFR) and gastrointestinal differentiation (GATA6) pathway genes being specifically altered in non-responders. Of note, NAV3 mutations were exclusively present in the non-responder group with a frequency of 22%. Thus, lower mutation burden, higher chromosomal instability and specific copy number alterations are associated with resistance to NAC.

    Mathematical and Statistical Techniques for Systems Medicine: The Wnt Signaling Pathway as a Case Study

    The last decade has seen an explosion in models that describe phenomena in systems medicine. Such models are especially useful for studying signaling pathways, such as the Wnt pathway. In this chapter we use the Wnt pathway to showcase current mathematical and statistical techniques that enable modelers to gain insight into (models of) gene regulation, and generate testable predictions. We introduce a range of modeling frameworks, but focus on ordinary differential equation (ODE) models since they remain the most widely used approach in systems biology and medicine and continue to offer great potential. We present methods for the analysis of a single model, comprising applications of standard dynamical systems approaches such as nondimensionalization, steady state, asymptotic and sensitivity analysis, and more recent statistical and algebraic approaches to compare models with data. We present parameter estimation and model comparison techniques, focusing on Bayesian analysis and coplanarity via algebraic geometry. Our intention is that this (non-exhaustive) review may serve as a useful starting point for the analysis of models in systems medicine. Comment: Submitted to 'Systems Medicine' as a book chapter.

    Arena3D: visualizing time-driven phenotypic differences in biological systems

    Background: Elucidating the genotype-phenotype connection is one of the big challenges of modern molecular biology. To fully understand this connection, it is necessary to consider the underlying networks and the time factor. In this context of data deluge and heterogeneous information, visualization plays an essential role in interpreting complex and dynamic topologies. Thus, software that is able to bring the network, phenotypic and temporal information together is needed. Arena3D has been previously introduced as a tool that facilitates link discovery between processes. It uses a layered display to separate different levels of information while emphasizing the connections between them. We present novel developments of the tool for the visualization and analysis of dynamic genotype-phenotype landscapes. Results: Version 2.0 introduces novel features that allow handling time course data in a phenotypic context. Gene expression levels or other measures can be loaded and visualized at different time points and phenotypic comparison is facilitated through clustering and correlation display or highlighting of impacting changes through time. Similarity scoring allows the identification of global patterns in dynamic heterogeneous data. In this paper we demonstrate the utility of the tool on two distinct biological problems of different scales. First, we analyze a medium scale dataset that looks at perturbation effects of the pluripotency regulator Nanog in murine embryonic stem cells. Dynamic cluster analysis suggests alternative indirect links between Nanog and other proteins in the core stem cell network. Moreover, recurrent correlations from the epigenetic to the translational level are identified. Second, we investigate a large scale dataset consisting of genome-wide knockdown screens for human genes essential in the mitotic process. Here, a potential new role for the gene lsm14a in cytokinesis is suggested. We also show how phenotypic patterning allows for extensive comparison and identification of high impact knockdown targets. Conclusions: We present a new visualization approach for perturbation screens with multiple phenotypic outcomes. The novel functionality implemented in Arena3D enables effective understanding and comparison of temporal patterns within morphological layers, to help with the system-wide analysis of dynamic processes. Arena3D is available free of charge for academics as a downloadable standalone application from: http://arena3d.org/.